Abstract



The nitrilase superfamily consists of thiol enzymes involved in natural product biosynthesis and post- 
translational modification in plants, animals, fungi and certain prokaryotes. On the basis of sequence 
similarity and the presence of additional domains, the superfamily can be classified into 13 branches, 
nine of which have known or deduced specificity for specific nitrile- or amide-hydrolysis or amide- 
condensation reactions. Genetic and biochemical analysis of the family members and their associated 
domains assists in predicting the localization, specificity and cell biology of hundreds of uncharacterized 
protein sequences. 



Plants, animals and fungi perform a wide variety of nonpep- 
tide carbon-nitrogen hydrolysis reactions using members of 
the nitrilase superfamily of enzymes. These nitrilase [1,2] and 
amidase [3,4] reactions, which produce auxin, biotin, 
β-alanine and other natural products, and which result in 
deamination of protein and amino acid substrates, all involve 
attack of a cyano or carbonyl carbon by a conserved cysteine 
[5,6]. Many bacteria and archaea, particularly those with an 
ecological relationship to plants and animals, encode 
members of the nitrilase superfamily and utilize the enzymes 
for chemically similar nitrile or amide hydrolysis reactions or 
for condensation of acyl chains to polypeptide amino termini. 

On the basis of global and structure-based sequence analysis, 
members of the nitrilase superfamily can now be classified 
into 13 branches and the substrate specificity of members of 
nine branches can be anticipated. Despite historical classifi- 
cation of all of these sequences as nitrilase-related, only one 
branch is known to have nitrilase activity, whereas eight 
branches have apparent amidase or amide-condensation 
activities. Members of seven branches of the nitrilase super- 
family have participated in domain fusion events that alter 
the localization of the nitrilase-related domain, link ammonia 
production to ammonia consumption, or potentially link pro- 
teins involved in cellular signaling. For example, fusion of 



domains we expect to have glutamine amidohydrolase (GAT) 
activity to some bacterial and all eukaryotic nicotinamide 
adenine dinucleotide (NAD) synthetases can account for the 
previously unsolved problem that only some NAD synthetases 
use glutamine as a source of ammonia [7-9]. 
Remarkably, these fusions contain the fourth apparent GAT 
domain involved in coupled amide transfer reactions as they 
are unrelated to other GAT-domain-containing families: the 
amino-terminal nucleophile (Ntn) hydrolases and triad amidotransferases 
[10], and the amidase signature family [11]. 
Crystal structures of two nitrilase superfamily members - 
worm NitFhit [12] and a bacterial N-carbamyl-D-amino acid 
amidohydrolase [13] - reveal that nitrilase-related proteins 
are multimeric α-β-β-α sandwich proteins that have a conserved 
Glu-Lys-Cys catalytic triad responsible for covalent 
catalysis. Mutating catalytic triad residues may allow sub- 
strates to be trapped and identified for the branches that 
remain to be characterized biochemically. 

Evolution and classification 

Members of the nitrilase superfamily appear to be found in 
all plants, animals and fungi, and many of these organisms 
have multiple nitrilase-related proteins from more than one 
branch of the superfamily. Nitrilase-related sequences are 
also found in phylogenetically isolated prokaryotes that 
appear to have an ecological relationship to plants and 
animals. The nitrilase superfamily therefore probably 
emerged prior to the separation of plants, animals and fungi, 
radiated into families, and then spread laterally to bacteria 
and archaea. Some branches of the nitrilase superfamily are 
found only in prokaryotes; members of these branches may 
constitute rational antibiotic targets. 

Automated sequence searching easily identifies predicted 
polypeptides as members of the nitrilase superfamily, but 
many database annotations have been applied haphazardly. 
Because members of the nitrilase superfamily are reported 
to be nitrilases, aliphatic amidases, β-ureidopropionases, 
β-alanine synthases, N-carbamyl-D-amino acid amidohydrolases 
and so on, these designations appear in the sequence-definition 
lines of multiple databases, often irrespective of the 
activity of the most closely related characterized enzyme. 

The reactions performed by nitrilases, amidases, carbamylases 
and N-acyl transferases within the nitrilase superfamily 
are shown schematically in Figure 1. It should be noted 
that the nitrilase branch of the nitrilase superfamily may be 
the only branch that contains members that perform nitrile 
hydrolysis (from a nitrile to the corresponding acid plus 
ammonia); at least eight branches appear to be either ami- 
dases of various specificities or enzymes that condense acyl 
chains to amino groups. Nitrile hydratases, metal-contain- 
ing enzymes that convert a nitrile to the corresponding 
amide [14], are not members of the nitrilase superfamily. 
Additionally, despite the fact that most branches of the 
nitrilase superfamily are actually amidases, there are many 
amidases including Ntn and triad hydrolases [10], amidase 
signature enzymes [15] and thiol proteases [16] that are 
unrelated to the nitrilase superfamily. Because of the histor- 
ical observation that aliphatic amidases are related to nitri- 
lases [4,6], we retain 'nitrilase' as the superfamily 
designation and as a branch designation, and embrace 
several families of homologous Glu-Lys-Cys amidases as 
branches of the nitrilase superfamily. 

We performed a large number of BLASTp (version 2.1.2) [17] 
and manual searches to identify prototypical members of 
branches of the nitrilase superfamily and we currently clas- 
sify the superfamily as having 13 branches, shown in Table 1. 
For the data uniquely classifying nitrilase sequences into 13 
branches, see the Additional data file available with the 
online version of this article. Examination of the E-values of 
sequences aligned with a prototype guided the classification 
of each of the 176 identified sequences as a member of only 
one branch. Within most branches, there is a relatively sharp 
cutoff in E-values such that sequences with E-values greater 
than 1 × 10⁻²⁵ can be identified as belonging to another 
branch. In the 13th branch, definition of a prototype - a 
sequence to which all branch members can be easily 
compared - was less straightforward as the sequences are 



relatively diverse. With more data, it would not be surprising 
to find further ways to divide and to classify members of the 
nitrilase superfamily. 
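
The E-value-based assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `hits` table, the prototype labels and the sequence identifiers are hypothetical, the cutoff of 1 × 10⁻²⁵ is the one quoted in the text, and a sequence is assigned to the branch whose prototype gives the lowest E-value provided that value does not exceed the cutoff. It does not reproduce the manual searches that were also part of the classification.

```python
# Hypothetical sketch: assign each sequence to at most one branch using
# BLASTp E-values against branch prototypes (all names and values are made up).

E_VALUE_CUTOFF = 1e-25  # cutoff quoted in the text for most branches


def assign_branch(evalues_by_prototype, cutoff=E_VALUE_CUTOFF):
    """Return the branch with the lowest E-value, or None if none passes the cutoff."""
    if not evalues_by_prototype:
        return None
    best_branch = min(evalues_by_prototype, key=evalues_by_prototype.get)
    if evalues_by_prototype[best_branch] <= cutoff:
        return best_branch
    return None


# Hypothetical E-values of three query sequences against three prototypes.
hits = {
    "seqA": {"branch1": 3e-80, "branch2": 2e-12, "branch10": 5e-9},
    "seqB": {"branch1": 4e-10, "branch2": 1e-60, "branch10": 7e-8},
    "seqC": {"branch1": 2e-11, "branch2": 6e-9, "branch10": 3e-7},
}

assignments = {name: assign_branch(ev) for name, ev in hits.items()}
for name, branch in assignments.items():
    print(name, branch if branch is not None else "unassigned (needs manual review)")
```

A sequence such as the hypothetical `seqC`, which passes the cutoff for no prototype, is left unassigned here; in the text such diverse sequences are the ones that end up in the 13th branch after manual inspection.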

Most members of each branch can be assigned to the branch 
not only by virtue of an E-value cutoff, but also by virtue of 
signature sequences surrounding active-site residues, pro- 
viding further confidence in the classification scheme. 
Essentially all members of the nitrilase superfamily have a 
conserved, apparent catalytic triad of glutamate, lysine and 
cysteine (only three apparently truncated sequences lack the 
glutamate). The motif that most highly correlates with 
E-value cutoffs consists of the two residues carboxy-terminal 
to the cysteine nucleophile. For example, members of the 
nitrilase branch of the nitrilase superfamily have a Cys-Trp-Glu 
motif at the active-site cysteine, whereas β-ureidopropionases 
have a Cys-Tyr-Gly motif. Consensus sequences 
for the glutamate-, lysine- and cysteine-surrounding 
residues of each branch of the nitrilase superfamily are 
shown in Figure 2. 
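
As a rough illustration of how the two residues following the cysteine nucleophile can serve as a branch signature, the sketch below looks up that dipeptide in a small table. Only the two motifs quoted in the text (Cys-Trp-Glu and Cys-Tyr-Gly) are included; the function names and the peptide fragments are hypothetical, and the position of the catalytic cysteine is assumed to be known already, for instance from an alignment.

```python
# Hypothetical sketch: read the two residues that follow the catalytic cysteine
# and compare them with the two branch motifs quoted in the text.

# One-letter codes: Cys-Trp-Glu -> "CWE", Cys-Tyr-Gly -> "CYG".
BRANCH_MOTIFS = {
    "WE": "nitrilase (branch 1)",
    "YG": "beta-ureidopropionase (branch 5)",
}


def motif_after_cysteine(sequence, cys_index):
    """Return the two residues carboxy-terminal to the cysteine at cys_index (0-based)."""
    if sequence[cys_index] != "C":
        raise ValueError("residue at cys_index is not a cysteine")
    return sequence[cys_index + 1:cys_index + 3]


def guess_branch(sequence, cys_index):
    """Map the post-cysteine dipeptide to a branch, or None if it is not in the table."""
    return BRANCH_MOTIFS.get(motif_after_cysteine(sequence, cys_index))


# Made-up peptide fragments, not real protein sequences.
fragment_1 = "GAVICWENHM"
fragment_2 = "GLNICYGRHH"
print(guess_branch(fragment_1, fragment_1.index("C")))
print(guess_branch(fragment_2, fragment_2.index("C")))
```

A lookup of this kind is meant to complement the E-value criterion rather than replace it; a dipeptide that is absent from the table simply returns no suggestion.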

Domain fusions in the nitrilase superfamily 

In seven branches of the nitrilase superfamily, a nitrilase- 
related domain is fused to at least one additional conserved 
domain (Figure 3). In three branches, the domain fusion 
appears to be constitutive; that is, all members of that branch 
(defined by BLAST E-value and signature sequences within 
the nitrilase-related domains) contain the additional domain. 
In four branches, the additional domain(s) are not found in 
every member. Some of the domain-fusion events can be con- 
sidered 'Rosetta Stone' fusions, in that separate polypeptides 
appear to be fused to coordinate biochemical reactions or cel- 
lular functions [18,19]. Other domain-fusion events appear 
more likely to affect cellular localization. The significance of 
domain fusions in branches 7 and 8, the prokaryotic and 
eukaryotic NAD synthetases, is discussed below. 

Two independently derived families of GAT domains have 
been found in a variety of two-domain polypeptides that 
couple ammonia hydrolysis from glutamine to ammonia 
consumption at a second active site [10]. Glutamine phos- 
phoribosylpyrophosphate (PRPP) amidotransferase is pro- 
totypical of enzymes that utilize a GAT domain related to 
the Ntn hydrolases [20], whereas GMP synthetase is proto- 
typical of enzymes that use a triad amidotransferase 
domain to perform the GAT function [21]. The second 
active site of GMP synthetase contains a nucleotide-binding 
domain similar to that of ammonia-dependent NAD syn- 
thetase [22]. It has been known for more than 30 years that 
Escherichia coli NAD synthetase differs from eukaryotic 
NAD synthetases in that it cannot use glutamine as an 
ammonia source [7]. Although yeast [8] and some prokary- 
otic NAD synthetases [23] are glutamine dependent, they 
do not contain an Ntn or triad GAT domain. The gluta- 
mine-dependent NAD synthetase from Mycobacterium 
Figure 1 

Four types of reaction carried out by nitrilase superfamily members. (a) The nitrilase reaction is performed by branch 1 
enzymes. In plants, the substrate is indole-3-acetonitrile and the product is indole-3-acetic acid. (b) The amidase reaction is 
the most frequently observed activity in the superfamily. Branch 2-4 enzymes are amidases and nitrilase-related domains of 
branch 7 and 8 enzymes are proposed to be amidases specific for glutamine. (c) The carbamylase reaction is a special case of 
the amidase reaction, carried out by branch 5 and 6 enzymes. (d) Branch 9 N-acyltransferases perform the amidase reaction 
in reverse, transferring a fatty acid from phospholipid (not shown) to a polypeptide amino terminus. The polypeptide 
acceptor usually contains an amino-terminal diacylglyceride-modified cysteine (not shown). All nitrilase-related reactions are 
thought to proceed through acylenzyme intermediates. 



tuberculosis, however, contains an amino-terminal domain 
[23] not present in the E. coli enzyme [24]. After the dis- 
covery that the multiprotein Bacillus glutamyl-tRNAGln 
amidotransferase contains yet a third type of GAT activity 
[11] related to the amidase signature family [15], it was 
hypothesized that the amino terminus of the prokaryotic 



glutamine-dependent NAD synthetase is related to the 
amidase signature family [23]. 

In contrast, we find that the amino terminus of prokaryotic 
glutamine-dependent NAD synthetase and the amino-termi- 
nal domains of all eukaryotic NAD synthetases are branches 
Table 1 



Summary of the enzyme activities of the nitrilase superfamily 
Figure 2 

The nitrilase superfamily catalytic triad motifs. Consensus sequences flanking the invariant catalytic triad residues, glutamate, 
lysine and cysteine, were obtained by doing multiple sequence alignments within each branch [54]. Red letters on a yellow 
background indicate the same residue is conserved in all branches. Dark blue letters on light blue background indicate the 
residue is conserved in nine or more branches. Green background shows positions in which the conserved amino acid is 
found in six to eight of the branches. Upper case letters indicate 90% or greater consensus levels within a branch, whereas 
lower case are 50% or greater. Residue numbers are shown for the prototypical members of branches 1 to 12 and for the 
first listed member of branch 13. 
Figure 3 

Domain structures for 13 branches of the nitrilase superfamily. Additional domains are found in members of seven branches. 
Parentheses denote domains found in only some members of the branch. In branch 4, vanins and biotinidases have carboxy-terminal 
domains unique to these two sub-branches and one vanin has additional full and partial nitrilase-related domains. The 
NAD synthetase domains of eukaryotes are always fused with a nitrilase-related domain. In contrast, only some prokaryotic 
NAD synthetases are fusion proteins with a nitrilase-related domain. This led to the prediction that branch 7 and 8 nitrilase 
domains are glutamine amidotransferases for the associated NAD synthetases (see text for details). Apolipoprotein 
N-acyltransferases (branch 9) always have a hydrophobic amino-terminal domain and one member is fused to an apparent 
dolichol phosphate mannose synthetase, which underscores the proposed function of branch 9 enzymes in post-translational 
modification. Nit proteins, branch 10, are found as fused Rosetta Stone proteins with Fhit in invertebrates and are 
coordinately expressed with separate Fhit proteins in mammals. Branch 12 enzymes are predicted to have protein substrates 
as they are fused to a homolog of an amino-terminal acetyl transferase. 



7 and 8 of the nitrilase superfamily, respectively. We deduce 
that branch 7 and 8 nitrilase-related domains have substrate 
specificity as glutamine amidases, and that branch 7 and 8 
enzymes utilize these novel GAT domains to confer gluta- 
mine dependence to the associated carboxy-terminal NAD 
synthetase domains. We therefore expect to find that the 
presence of branch 7 nitrilase-related domains will correlate 
with the ability of purified prokaryotic NAD synthetases to 
use glutamine, and we expect that the glutamine dependence 
of prokaryotic and eukaryotic glutamine-dependent NAD 
synthetases will depend on nitrilase-homologous active-site 
residues. If this is confirmed, branch 7 and 8 nitrilase 



domains will constitute the fourth independent type of GAT 
domain to participate in coupled amino-transfer reactions. 

Enzymology 

Nonenzymatic hydrolysis of a nitrile of the form R-C≡N 
would produce the corresponding acid amide, R-C=O(NH2), 
with one water addition and the corresponding acid, R-CO2-, 
with the second water addition. Nitrilases are interesting, 
however, in that the substrates are nitriles but the reaction 
does not involve release of, or reaction with, a substantial 
amount of the corresponding amide [1,25]. Nitrilases 
produce the acid without the production or release of an acid 
amide by virtue of covalent, thiol-mediated catalysis [5,25]. 
As illustrated in Figure 1, the enzyme attacks a nitrile sub- 
strate covalently, producing ammonia with the first water 
addition, and producing acid and a regenerated enzyme with 
the second water addition. The geometric constraints of this 
reaction suggest that nitrilase facilitates interaction with a 
linear (approximately 180°) substrate, planar (approximately 
120°) thioimidate and acylenzyme intermediates, and 
tetrahedral (approximately 109.5°) water-bonded intermediates. 
In contrast, serine and thiol proteases and amidases are 
confined to interacting with planar substrates and tetrahedral 
intermediates. We speculate that most nitrilases bind 
strongly to a bulky substrate R group in a conformation that 
places the 2 carbon closer to 120° than to 180° from the 
cyano nitrogen. Fitting a distorted substrate nitrile would 
push the substrate toward thioimidation and would reduce 
the geometric sweeps required of enzyme complexes. In 
support of this view, most nitrilases prefer bulky substrates 
to nonsubstituted acetonitrile [1,25-28]. Cyanide hydratase, 
a member of the nitrilase branch, may be the exception that 
proves the rule: the R-group free substrate does not stay 
bound to produce acid but rather is decomposed to for- 
mamide after one water addition [29,30]. 
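
The two routes contrasted in this paragraph can be written out as a simplified scheme. This is a sketch based only on the description in the text and in Figure 1: E-SH stands for the enzyme's catalytic cysteine, protonation states are simplified, and the tetrahedral water-bonded intermediates are omitted for brevity.

```latex
\begin{align*}
\text{Nonenzymatic:}\quad
& \mathrm{R{-}C{\equiv}N} + \mathrm{H_2O} \longrightarrow \mathrm{R{-}C({=}O)NH_2} \\
& \mathrm{R{-}C({=}O)NH_2} + \mathrm{H_2O} \longrightarrow \mathrm{R{-}CO_2^{-}} + \mathrm{NH_4^{+}} \\[4pt]
\text{Nitrilase:}\quad
& \mathrm{E{-}SH} + \mathrm{R{-}C{\equiv}N} \longrightarrow \mathrm{E{-}S{-}C({=}NH)R}
  && \text{(thioimidate)} \\
& \mathrm{E{-}S{-}C({=}NH)R} + \mathrm{H_2O} \longrightarrow \mathrm{E{-}S{-}C({=}O)R} + \mathrm{NH_3}
  && \text{(acylenzyme)} \\
& \mathrm{E{-}S{-}C({=}O)R} + \mathrm{H_2O} \longrightarrow \mathrm{R{-}CO_2H} + \mathrm{E{-}SH}
\end{align*}
```

Written this way, the free amide appears only in the nonenzymatic route, which is consistent with the observation that nitrilases release little or no amide.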

As we have discussed, most branches of the nitrilase super- 
family do not contain nitrilases but rather amide-hydrolyzing 
or amide-condensing enzymes. Although activation of water 
to attack planar intermediates is expected to be shared by all 
enzymatically active members of the superfamily, the bio- 
chemical basis for nitrile versus amide attack within the nitri- 
lase superfamily is not yet understood. Biotinidases, branch 4 
of the nitrilase superfamily, are amidases specific for hydroly- 
sis of biotinamides such as biocytin to biotin plus lysine [31]. 
For this branch, leaving group specificity allows biotinylated 
peptides, biocytin, simple biotinamide and biotin esters to be 
substrates [31]. As alcohols are better leaving groups than 
amines, it would not be surprising if other members of the 
nitrilase superfamily have a biological function as esterases. 
Although no member of the nitrilase superfamily has been 
reported to have protease activity, members of branches 3 
and 4 act on sidechains of polypeptides and members of 
branch 9 perform a condensation to polypeptide amino 
termini. Because branch 12 enzymes are fused to a probable 
amino-terminal acetyltransferase, they may have protein sub- 
strates as well. Protease activities may remain to be discov- 
ered in the superfamily. The enzyme activities of the nitrilase 
superfamily are summarized in Table 1. 

Structural features 

Crystal structures of an N-carbamyl-D-amino acid amidohydrolase 
from Agrobacterium [13] (a carbamylase; branch 6) 
and the Caenorhabditis elegans NitFhit Rosetta Stone 
protein [12] (branch 10) have been determined. The nitrilase- 
homologous domain of NitFhit and the carbamylase have 



similar three-dimensional structures, conserved chemical 
features, and were independently interpreted as utilizing the 
conserved glutamate residue as a general base for the cys- 
teine nucleophile [12,13]. The Nit domain of NitFhit and the 
carbamylase can be described as α-β-β-α sandwich proteins, 
both of which assemble as tetramers. Nit and the carbamy- 
lase are unrelated to other enzymes with known structures 
such as Ntn and triad hydrolases [10], and thiol proteases 
[16]. Figure 4 shows the geometry of the Nit active site, highlighting 
residues that are absolutely conserved in the superfamily 
(Glu54, Lys127 and Cys169) and residues at positions 
that are highly conserved (Tyr125, His129, Tyr170, Asp171, 
Arg173 and Phe174), as aligned in Figure 2. 

Branches of the nitrilase superfamily 
Branch 1: nitrilase 

Members of the nitrilase branch (EC 3.5.5.1) are found in 
plants, animals (C. elegans), fungi (Saccharomyces cerevisiae's 
frequently inactivated NIT1 gene), and many types of 
bacteria. The best evidence that nitrilase functions in vivo to 
convert indoleacetonitrile to the plant growth factor indole-3- 
acetic acid (auxin) comes from Arabidopsis, in which it was 
shown that recessive mutations in a nitrilase gene resulted in 
reduced sensitivity to the auxin-like effects of indoleacetoni- 
trile and that overexpression of a nitrilase caused increased 
sensitivity to indoleacetonitrile [32]. Bacterial nitrilases are 
often exploited for biochemical syntheses and for environ- 
mental remediation [33]. It is not clear whether bacterial 
nitrilases primarily function in ecological relationships with 
plants or whether they benefit isolated microbes. 

Branch 2: aliphatic amidase 

Aliphatic amidases (EC 3.5.1.4) [3,4] comprise a small 
branch of nearly identical proteins found in Pseudomonas, 
Bacillus, Brevibacteria and Helicobacteria. They hydrolyze 
substrates such as the carboxamide sidechains of glutamine 



Figure 4 

Nitrilase-related active site of C. elegans NitFhit. Stereoview 
of sidechains of invariant and highly conserved residues from 
the crystal structure of NitFhit [12]. 
and asparagine utilizing the conserved cysteine within the 
nitrilase superfamily. 

Branch 3: amino-terminal amidase 

The N-end rule is a means by which the rate of ubiquitin-dependent 
protein degradation is regulated. The S. cerevisiae 
Nta1 protein deaminates amino-terminal asparagine 
and glutamine residues to aspartate and glutamate, which 
lead to more rapid rates of protein turnover [34]. Nta1 has 
fungal homologs but mammalian amino-terminal amidases 
appear to be unrelated. 

Branch 4: biotinidase 

Biotinidases (EC 3.5.1.12) utilize specific amidase/esterase 
activity to release biotin from biotinamide, biotin-lysine and 
biotin-peptide conjugates and biotin methylester [35]. Bio- 
tinidase deficiency can result in an inability to recycle biotin 
that is manifested in neurological and cutaneous abnormali- 
ties in humans [36]. Biotinidases are secreted into serum and 
have a unique, conserved carboxy-terminal domain. Vanins 
[37] and GPI-80 [38] are members of the biotinidase branch 
that contain a similar carboxy-terminal domain containing, 
in addition, a GPI anchor and are involved in T-cell thymic 
homing and neutrophil adherence and migration. One 
member of this branch contains repeated nitrilase-related 
domains. Recently, porcine pantetheinase (EC 3.5.1.-), an 
amidase that converts pantetheine to pantothenate plus cysteamine 
in the dissimilative pathway of CoA, was sequenced 
and found to be nearly identical to vanins [39]. Although the 
biologically important substrate of vanins remains unproven, 
sequence and enzymatic similarity with biotinidases suggest 
that an amine molecule at least the size of an amino acid (that 
is, bigger than ammonia) may be the leaving group. Branch 4 
enzymes are the only amidases in the nitrilase superfamily 
known to prefer secondary amine substrates of the form 
R-C=O(NHR') as opposed to simple acid amides. An extensive 
archive of vanins, including 118 expressed sequence tag 
(EST) sequences is available [40,41]. 

Branch 5: β-ureidopropionase 

The β-ureidopropionases (EC 3.5.1.6) are enzymes involved 
in the catabolism of pyrimidine bases and the production of 
β-alanine [42]. Substrates of this enzyme are of the carbamylase 
type (see Figure 1c) and the amine product is usually a 
non-standard amino acid. 

Branch 6: carbamylase 

A variety of bacteria express hydrolases specific for the 
decarbamylation of D-amino acids. These enzymes have 
been exploited in the production of semisynthetic β-lactam 
antibiotics [43] and are now represented by the structure of 
the Agrobacterium enzyme [13]. 

Branches 7 and 8: glutamine-dependent NAD synthetase 

As discussed earlier, the presence of a nitrilase-related 
domain appears to correlate with the ability of bacterial NAD 



synthetase (EC 6.3.5.1) to utilize glutamine as an ammonia 
source. Eukaryotic NAD synthetases always contain this 
novel, putative GAT domain and exhibit glutamine depen- 
dence. Substrate specificity of nitrilase-related proteins as 
glutamine amidases is not surprising given the specificity of 
the branch 2 and 3 enzymes. It remains to be seen how glut- 
amine-dependent NAD synthetase may channel ammonia 
from the nitrilase-related active site to the NAD active site. 

Branch 9: apolipoprotein N-acyltransferase 

The modification and processing of Braun's lipoprotein, a 
major component of the outer membrane of E. coli, has been 
studied for decades [44]. Defects in this post-translational 
modification pathway are associated with copper sensitivity 
[45]. The apolipoprotein becomes proteolyzed, exposing an 
amino-terminal cysteine. After the cysteine is modified by 
diacylglycerol, branch 9 enzymes condense a fatty acid to the 
amino terminus of the modified cysteine residue. 

Branch 10: Nit 

Nit was originally identified as an approximately 300 amino 
acid amino-terminal extension on fly and worm homologs 
[46] of the human [47] and murine [48] Fhit tumor suppres- 
sor protein. Nit homologs are found in organisms with Fhit 
homologs [12] and, in the mouse, Nit1 and Fhit mRNA levels 
are highly correlated in seven of eight tissues examined [46]. 
Satisfaction of these criteria suggested that NitFhit is a 
Rosetta Stone protein, whose fusion might decode a previ- 
ously unsuspected interaction between the proteins [18,19]. 
As Fhit is part of a cell-death pathway that is not clearly con- 
nected to known apoptotic players [49,50], identification of 
Nit as a Fhit-interacting protein was welcomed. The Fhit 
active site of NitFhit has been characterized and the struc- 
ture of worm NitFhit has been elucidated [12], but the Nit 
substrate, cell biology and relationship to tumor suppression 
are not known. 

The most striking feature of the Nit-Fhit interaction appar- 
ent from the crystal structure of the worm protein is that the 
complex assembles with a central Nit tetramer binding two 
Fhit dimers [12]. The carboxy-terminal β strands of Nit-conserved 
polypeptide sequences exit the compact Nit tetramer 
domain and physically interact with Fhit dimers. Fhit dimers 
are bound in a way that allows them to expose diadenosine 
polyphosphate-binding sites opposite from the Nit interaction 
surface [12]. Furthermore, the nucleotide kinetics of 
NitFhit active sites [12] were extremely similar to those of 
human Fhit dimers in the absence of Nit [51]. 

Concord between the phylogenetic profiles [52] of Fhit and 
Nit breaks down slightly with the discovery of Nit-related 
sequences in a small number of prokaryotes that have no 
Fhit homolog (see the Additional data file). The idea that 
nitrilase-related proteins spread from animals and plants to 
prokaryotes is, however, supported by the animal-associated 
ecology of these microbes. 
Branches 11-13 

Branches 11 and 12 contain distinct similarity groups with no 
characterized member. Branch 12 may contain Rosetta Stone 
[18,19] proteins in that a distinctive nitrilase-related domain 
is found fused to an amino-terminal domain of approxi- 
mately 210 amino acids. The branch-12-associated domain is 
related to the RimI [53] superfamily of amino-terminal 
acetyltransferases, suggesting that branch 12 enzymes are 
involved in post-translational modifications. Branch 13 con- 
tains uncharacterized, nonfused nitrilase-related proteins 
that are difficult to place in a distinct similarity group. 



Conclusions 

On the basis of newly obtained structures of nitrilase-related 
proteins and the available literature, we have provided a 
classification of all available nitrilase-related sequences. 
Every activity appears to work through a thiol acylenzyme 
intermediate and depend on a novel Glu-Lys-Cys catalytic 
triad. No activity forms or hydrolyzes a peptide bond, yet 
several affect post-translational modifications of lysine or 
carboxyamide sidechains or polypeptide amino termini. 
Other activities are involved in natural product biosynthesis 
and other metabolic pathways. Activities on amide sub- 
strates are found in at least eight branches of the superfam- 
ily. Activity on nitrile substrates has only been found in one 
branch. Membership in branches, based on BLAST E-value 
and structure-based signature sequence analysis, appears to 
correlate well with distinct substrate specificity and biologi- 
cal activities in all branches for which experimental data are 
available. Fusions between nitrilase-related domains and 
other conserved sequences are extremely common in the 
nitrilase superfamily. Fusions with NAD synthetase domains 
are here interpreted as solving a 30 year old problem: two 
branches of the nitrilase superfamily are posited to be novel 
GAT domains that account for the glutamine dependence of 
some bacterial and all eukaryotic NAD synthetases. 



Additional data 

The following additional data file is available (in HTML 
format): links for the 176 sequences in the 13-branch classifi- 
cation system of the nitrilase superfamily. The additional data 
file can be accessed from: 
http://genomebiology.com/2001/2/1/reviews/0001/gb-2001-2-1-reviews0001-S1.asp 